Mining for Practices in Community Collections: Finds From Simple Wikipedia

Authors

  • Matthijs den Besten
  • Alessandro Rossi
  • Loris Gaio
  • Max Loubser
  • Jean-Michel Dalle
Abstract

The challenges of commons-based peer production are usually associated with the development of complex software projects such as Linux and Apache. But the case of open content production should not be treated as a trivial one. For instance, while the task of maintaining a collection of encyclopedic articles might seem negligible compared with that of holding together a software system with its many modules and interdependencies, it still poses quite demanding problems. In this paper, we describe the methods and practices adopted by Simple Wikipedia to keep its articles easy to read. Based on measurements of article readability and similarity, we conclude that while the mechanisms adopted by the community had some effect, in the long run more effort and new practices might be necessary to maintain an acceptable level of readability in the Simple Wikipedia collection.

Similar resources

Evaluation of Occupational Therapy Practices for Artisanal Gold Mining in Bagega Community, Zamfara State, Nigeria

Background: The enormous numbers of people involved in artisanal gold mining (AGM) together with primitive methods being used in processing gold have resulted in health and environmental challenges. Based on this, both the local and international stakeholders in mining and health sectors engaged in therapeutic practice to mitigate the challenges. Methods: Physical observation and soil samples ...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Extracting Structured Knowledge for Semantic Web by Mining Wikipedia

Since Wikipedia has become a huge scale database storing wide-range of human knowledge, it is a promising corpus for knowledge extraction. A considerable number of researches on Wikipedia mining have been conducted and the fact that Wikipedia is an invaluable corpus has been confirmed. Wikipedia’s impressive characteristics are not limited to the scale, but also include the dense link structure...

Mining Multiword Terms from Wikipedia

The collection of the specialized vocabulary of a particular domain (terminology) is an important initial step of creating formalized domain knowledge representations (ontologies). Terminology Extraction (TE) aims at automating this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extrac...

Mining cross-cultural relations from Wikipedia

For many people, Wikipedia represents one of the primary sources of knowledge about foreign cultures. Yet, different Wikipedia language editions offer different descriptions of cultural practices. Unveiling diverging representations of cultures provides an important insight, since they may foster the formation of cross-cultural stereotypes, misunderstandings and potentially even conflict. In th...

Publication date: 2008